ABSTRACT

A system and methods for matching media entities to associate
closely related media entities are provided. In connection with a
system that convergently merges perceptual and digital signal
processing analysis of media entities for purposes of classifying
the media entities, various means are provided to match media
entities using aggregation and non-aggregation matching. In an
illustrative implementation, various factors representative of
inherent characteristics of the media entities are employed and
processed to generate data sets having closely related and/or
similarly situated media entities. Once media matching is performed
on a library of media entities, the results of the media match may
be persisted for the user from experience to experience.

20 Claims, 6 Drawing Sheets 
MUSIC CONTENT CHARACTERISTIC IDENTIFICATION AND MATCHING

CROSS REFERENCE TO RELATED APPLICATION

This application is related to and claims priority under 35 U.S.C.
§119(e) to U.S. Provisional Application Ser. No. 60/215,807, filed
Jul. 5, 2000, entitled "MUSIC MATCHING PROCESS," the contents of
which are hereby incorporated by reference in their entirety. This
application also relates to U.S. patent application Ser. Nos.
09/900,230, 09/900,059 through 09/928,004.

FIELD OF THE INVENTION

The present invention relates to a system and method to allow music
matching for delivering music to users of computing devices
connected to a network.

BACKGROUND OF THE INVENTION

Classifying information that has subjectively perceived attributes
or characteristics is difficult. When the information is one or more
musical compositions, classification is complicated by the widely
varying subjective perceptions of the musical compositions by
different listeners. One listener may perceive a particular musical
composition as "hauntingly beautiful" whereas another may perceive
the same composition as "annoyingly twangy."

In the classical music context, musicologists have developed names
for various attributes of musical compositions. Terms such as
adagio, fortissimo, or allegro broadly describe the strength with
which instruments in an orchestra should be played to properly
render a musical composition from sheet music. In the popular music
context, there is less agreement upon proper terminology. Composers
indicate how to render their musical compositions with annotations
such as brightly, softly, etc., but there is no consistent, concise,
agreed-upon system for such annotations.

As a result of rapid movement of musical recordings from sheet music
to pre-recorded analog media to digital storage and retrieval
technologies, this problem has become acute. In particular, as large
libraries of digital musical recordings have become available
through global computer networks, a need has developed to classify
individual musical compositions in a quantitative manner based on
highly subjective features, in order to facilitate rapid search and
retrieval of large collections of compositions.

Musical compositions and other information are now widely available
for sampling and purchase over global computer networks through
online merchants such as Amazon.com, Inc., barnesandnoble.com,
cdnow.com, etc. A prospective consumer can use a computer system
equipped with a standard Web browser to contact an online merchant,
browse an online catalog of pre-recorded music, select a song or
collection of songs ("album"), and purchase the song or album for
shipment direct to the consumer. In this context, online merchants
and others desire to assist the consumer in making a purchase
selection and desire to suggest possible selections for purchase.
However, current classification systems and search and retrieval
systems are inadequate for these tasks.

A variety of inadequate classification and search approaches are now
used. In one approach, a consumer selects a musical composition for
listening or for purchase based on past positive experience with the
same artist or with similar music. This approach has a significant
disadvantage in that it involves guessing because the consumer has
no familiarity with the musical composition that is selected.

In another approach, a merchant classifies musical compositions into
broad categories or genres. The disadvantage of this approach is
that typically the genres are too broad. For example, a wide variety
of qualitatively different albums and songs may be classified in the
genre of "Popular Music" or "Rock and Roll."

In still another approach, an online merchant presents a search page
to a client associated with the consumer. The merchant receives
selection criteria from the client for use in searching the
merchant's catalog or database of available music. Normally the
selection criteria are limited to song name, album title, or artist
name. The merchant searches the database based on the selection
criteria and returns a list of matching results to the client. The
client selects one item in the list and receives further, detailed
information about that item. The merchant also creates and returns
one or more critics' reviews, customer reviews, or past purchase
information associated with the item.

For example, the merchant may present a review by a music critic of
a magazine that critiques the album selected by the client. The
merchant may also present informal reviews of the album that have
been previously entered into the system by other consumers. Further,
the merchant may present suggestions of related music based on prior
purchases of others. For example, in the approach of Amazon.com,
when a client requests detailed information about a particular album
or song, the system displays information stating, "People who bought
this album also bought ..." followed by a list of other albums or
songs. The list of other albums or songs is derived from actual
purchase experience of the system. This is called "collaborative
filtering."

However, this approach has a significant disadvantage, namely that
the suggested albums or songs are based on extrinsic similarity as
indicated by purchase decisions of others, rather than based upon
objective similarity of intrinsic attributes of a requested album or
song and the suggested albums or songs. A decision by another
consumer to purchase two albums at the same time does not indicate
that the two albums are objectively similar or even that the
consumer liked both. For example, the consumer might have bought one
for the consumer and the second for a third party having greatly
differing subjective taste than the consumer. As a result, some
pundits have termed the prior approach as the "greater fools"
approach because it relies on the judgment of others.

Another disadvantage of collaborative filtering is that output data
is normally available only for complete albums and not for
individual songs. Thus, a first album that the consumer likes may be
broadly similar to a second album, but the second album may contain
individual songs that are strikingly dissimilar from the first
album, and the consumer has no way to detect or act on such
dissimilarity.

Still another disadvantage of collaborative filtering is that it
requires a large mass of historical data in order to provide useful
search results. The search results indicating what others bought are
only useful after a large number of transactions, so that meaningful
patterns and meaningful similarity emerge. Moreover, early
transactions tend to over-influence later buyers, and popular titles
tend to self-perpetuate.

In a related approach, the merchant may present information
describing a song or an album that is prepared and distributed by
the recording artist, a record label, or other entities that are
commercially associated with the recording. A disadvantage of this
information is that it may be biased, it may deliberately
mischaracterize the recording in the hope of increasing its sales,
and it is normally based on inconsistent terms and meanings.

In still another approach, digital signal processing (DSP) analysis
is used to try to match characteristics from song to song, but DSP
analysis alone has proven to be insufficient for classification
purposes. While DSP analysis may be effective for some groups or
classes of songs, it is ineffective for others, and there has so far
been no technique for determining what makes the technique effective
for some music and not others. Specifically, such acoustical
analysis as has been implemented thus far suffers defects because 1)
the effectiveness of the analysis is being questioned regarding the
accuracy of the results, thus diminishing the perceived quality by
the user and 2) recommendations can only be made if the user
manually types in a desired artist or song title from that specific
website. Accordingly, DSP analysis, by itself, is unreliable and
thus insufficient for widespread commercial or other use.

Accordingly, there is a need for an improved method of classifying
information that is characterized by the convergence of subjective
or perceptual analysis and DSP acoustical analysis criteria. With
such a classification technique, it would be further desirable to
leverage song-by-song analysis and matching capabilities to
automatically and/or dynamically personalize a network-based
experience for a user. In this regard, there is a need for a
mechanism that can enable a client to automatically retrieve
information about one or more musical compositions, user
preferences, ratings, or other sources of mappings to enhance an
experience for listener(s).

SUMMARY OF THE INVENTION 

In view of the foregoing, the present invention provides a system
and methods that allow for automatic matching of music based on
various music characteristics for delivery to participating users.
In connection with a system that convergently merges perceptual and
digital signal processing analysis of media entities for purposes of
classifying the media entities in an effort to provide relevant
media to participating users, the present invention provides various
means to generate playlists, perform "sounds-like" searches, and
perform media matching based on meta-data, such as artist or title
matching. Techniques for weighting media to perform desired matches
are also provided. Once music matching is performed for a library of
media to create a roll-up table, the results of the music matching
process may be persisted for a user from experience to experience.

Other features of the present invention are described below.

BRIEF DESCRIPTION OF THE DRAWINGS 

The system and methods for the automatic transmission of new, high
affinity media are further described with reference to the
accompanying drawings in which:

FIG. 1 is a block diagram representing an exemplary network
environment in which the present invention may be implemented;

FIG. 2 is a high level block diagram representing the media content
classification system utilized to classify media, such as music, in
accordance with the present invention;

FIG. 3 is a block diagram illustrating an exemplary method of the
generation of general media classification rules from analyzing the
convergence of classification in part based upon subjective and in
part based upon digital signal processing techniques;

FIG. 4 is a flow diagram of the processing performed and data flow
realized to perform matching of media elements in accordance with
the present invention; and

FIGS. 5 and 5A illustrate the processing of an exemplary algorithm
to perform score-based matching for a song-to-song matching process
in accordance with the present invention.

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENTS 

Overview 

The present invention provides systems and methods to closely relate
media entities based on one or more inherent media entity
characteristic(s). Closely related and/or similarly situated (i.e.
"matched") media entities may subsequently be grouped to create
playlists. For example, commonly assigned U.S. patent application
Ser. No. 09/905,011, filed Jul. 13, 2001, entitled "Dynamic Playlist
of Media", hereinafter "playlist generation," describes novel
techniques for dynamically generating playlists of closely related
media entities. Alternatively, the operations of the present
invention may be employed to perform searches among an aggregated
list of media entities (e.g. songs), such that similarity searches
(e.g. sounds-like or looks-like searches) may be performed, or
further, exact media entity author and/or title searches may be
performed. The dynamic playlist generation system enables the
creation and distribution of playlists to participating users
wherein closely related media entities are gathered for inclusion
within a given generated playlist. The present invention, among
other implementations, powers the playlist generation system by
providing systems and methods that associate closely related and/or
similarly situated media entities with each other using inherent
media entity characteristics.
Exemplary Computer and Network Environments 

One of ordinary skill in the art can appreciate that a computer 110
or other client device can be deployed as part of a computer
network. In this regard, the present invention pertains to any
computer system having any number of memory or storage units, and
any number of applications and processes occurring across any number
of storage units or volumes. The present invention may apply to an
environment with server computers and client computers deployed in a
network environment, having remote or local storage. The present
invention may also apply to a standalone computing device, having
access to appropriate classification data.

FIG. 1 illustrates an exemplary network environment, with a server
in communication with client computers via a network, in which the
present invention may be employed. As shown, a number of servers
10a, 10b, etc., are interconnected via a communications network 14,
which may be a LAN, WAN, intranet, the Internet, etc., with a number
of client or remote computing devices 110a, 110b, 110c, 110d, 110e,
etc., such as a portable computer, handheld computer, thin client,
networked appliance, or other device, such as a VCR, TV, and the
like in accordance with the present invention. It is thus
contemplated that the present invention may apply to any computing
device in connection with which it is desirable to provide
classification services for different types of content such as
music, video, other audio, etc. In a network environment in which
the communications
network 14 is the Internet, for example, the servers 10 can be Web
servers with which the clients 110a, 110b, 110c, 110d, 110e, etc.
communicate via any of a number of known protocols such as hypertext
transfer protocol (HTTP). Communications may be wired or wireless,
where appropriate. Client devices 110 may or may not communicate via
communications network 14, and may have independent communications
associated therewith. For example, in the case of a TV or VCR, there
may or may not be a networked aspect to the control thereof. Each
client computer 110 and server computer 10 may be equipped with
various application program modules 135 and with connections or
access to various types of storage elements or objects, across which
files may be stored or to which portion(s) of files may be
downloaded or migrated. Any server 10a, 10b, etc. may be responsible
for the maintenance and updating of a database 20 in accordance with
the present invention, such as a database 20 for storing
classification information, music and/or software incident thereto.
Thus, the present invention can be utilized in a computer network
environment having client computers 110a, 110b, etc. for accessing
and interacting with a computer network 14 and server computers 10a,
10b, etc. for interacting with client computers 110a, 110b, etc. and
other devices 111 and databases 20.

Classification

In accordance with one aspect of the present invention, a unique
classification is implemented which combines human and machine
classification techniques in a convergent manner, from which a
canonical set of rules for classifying music may be developed, and
from which a database, or other storage element, may be filled with
classified songs. With such techniques and rules, radio stations,
studios and/or anyone else with an interest in classifying music can
classify new music. With such a database, music association may be
implemented in real time, so that playlists or lists of related (or
unrelated if the case requires) media entities may be generated.
Playlists may be generated, for example, from a single song and/or a
user preference profile in accordance with an appropriate analysis
and matching algorithm performed on the data store of the database.
Nearest neighbor and/or other matching algorithms may be utilized to
locate songs that are similar to the single song and/or are suited
to the user profile.

FIG. 2 illustrates an exemplary classification technique in
accordance with the present invention. Media entities, such as songs
210, from wherever retrieved or found, are classified according to
human classification techniques at 220 and also classified according
to automated computerized DSP classification techniques at 230. 220
and 230 may be performed in either order, as shown by the dashed
lines, because it is the marriage or convergence of the two analyses
that provides a stable set of classified songs at 240. As discussed
above, once such a database of songs is classified according to both
human and automated techniques, the database becomes a powerful tool
for generating songs with a playlist generator 250. A playlist
generator 250 may take input(s) regarding song attributes or
qualities, which may be a song or user preferences, and may output a
playlist, recommend other songs to a user, filter new music, etc.
depending upon the goal of using the relational information provided
by the invention. In the case of a song as an input, first, a DSP
analysis of the input song is performed to determine the attributes,
qualities, likelihood of success, etc. of the song. In the case of
user preferences as an input, a search may be performed for songs
that match the user preferences to create a playlist or make
recommendations for new music. In the case of filtering new music,
the rules used to classify the songs in database 240 may be
leveraged to determine the attributes, qualities, genre, likelihood
of success, etc. of the new music. In effect, the rules can be used
as a filter to supplement any other decision making processes with
respect to the new music.

FIG. 3 illustrates an embodiment of the invention, which generates
generalized rules for a classification system. A first goal is to
train a database with enough songs so that the human and automated
classification processes converge, from which a consistent set of
classification rules may be adopted, and adjusted to accuracy.
First, at 305, a general set of classifications are agreed upon in
order to proceed consistently, i.e., a consistent set of terminology
is used to classify music in accordance with the present invention.
At 310, a first level of expert classification is implemented,
whereby experts classify a set of training songs in database 300.
This first level of expert is fewer in number than a second level of
expert, termed herein a groover, and in theory has greater expertise
in classifying music than the second level of expert or groover. The
songs in database 300 may originate from anywhere, and are intended
to represent a broad cross-section of music. At 320, the groovers
implement a second level of expert classification. There is a
training process in accordance with the invention by which groovers
learn to consistently classify music, for example to 92-95%
accuracy. The groover scrutiny reevaluates the classification of
310, and reclassifies the music at 325 if the groover determines
that reassignment should be performed before storing the song in
human classified training song database 330.

Before, after or at the same time as the human classification
process, the songs from database 300 are classified according to
digital signal processing (DSP) techniques at 340. Exemplary
classifications for songs include, inter alia, tempo, sonic, melodic
movement and musical consonance characterizations. Classifications
for other types of media, such as video or software are also
contemplated. The quantitative machine classifications and
qualitative human classifications for a given piece of media, such
as a song, are then placed into what is referred to herein as a
classification chain, which may be an array or other list of
vectors, wherein each vector contains the machine and human
classification attributes assigned to the piece of media. Machine
learning classification module 350 marries the classifications made
by humans and the classifications made by machines, and in
particular, creates a rule when a trend meets certain criteria. For
example, if songs with heavy activity in the frequency spectrum at 3
kHz, as determined by the DSP processing, are also characterized as
"jazzy" by humans, a rule can be created to this effect. The rule
would be, for example: songs with heavy activity at 3 kHz are jazzy.
Thus, when enough data yields a rule, machine learning
classification module 350 outputs a rule to rule set 360. While this
example alone may be an oversimplification, since music patterns are
considerably more complex, it can be appreciated that certain DSP
analyses correlate well to human analyses.

However, once a rule is created, it is not considered a generalized
rule. The rule is then tested against like pieces of media, such as
song(s), in the database 370. If the rule works for the
generalization song(s) 370, the rule is considered generalized. The
rule is then subjected to groover scrutiny 380 to determine if it is
an accurate rule at 385. If the rule is inaccurate according to
groover scrutiny, the rule is adjusted. If the rule is considered to
be accurate, then the rule is kept as a relational rule, e.g., one
that may classify new media.
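
The rule-creation and generalization steps described for FIG. 3 can
be sketched in Python as follows. This is a hedged illustration
only: the feature name `energy_3khz`, the single-threshold rule
form, and the support and accuracy cut-offs are assumptions
introduced for the example, since the source does not state which
criteria a trend must meet before a rule is created.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One vector of a classification chain: machine (DSP) and human attributes."""
    dsp: dict          # e.g. {"energy_3khz": 0.9}; feature names are hypothetical
    human: frozenset   # e.g. frozenset({"jazzy"})


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule form: 'songs with feature >= threshold are <label>'."""
    feature: str
    threshold: float
    label: str

    def fires(self, entry):
        return entry.dsp.get(self.feature, 0.0) >= self.threshold


def accuracy(rule, entries):
    """Fraction of the entries the rule fires on that humans also gave the label.

    Returns None when the rule fires on no entry at all.
    """
    fired = [e for e in entries if rule.fires(e)]
    if not fired:
        return None
    return sum(rule.label in e.human for e in fired) / len(fired)


def propose_rule(chain, feature, threshold, label,
                 min_support=3, min_accuracy=0.9):
    """Create a rule only when the trend meets the (assumed) criteria."""
    rule = Rule(feature, threshold, label)
    support = sum(rule.fires(e) for e in chain)
    acc = accuracy(rule, chain)
    if support >= min_support and acc is not None and acc >= min_accuracy:
        return rule
    return None


def generalizes(rule, generalization_songs, min_accuracy=0.9):
    """Test a created rule against songs that were not used to create it."""
    acc = accuracy(rule, generalization_songs)
    return acc is not None and acc >= min_accuracy
```

In this sketch a rule that passes `generalizes` would still be
handed to human reviewers (the groover scrutiny described above)
before being kept; that review step is deliberately not modeled.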
The above-described technique thus maps a pre-defined parameter
space to a psychoacoustic perceptual space defined by musical
experts. This mapping enables content-based searching of media,
which in part enables the automatic transmission of high affinity
media content, as described below.

Matching Processing

The present invention relates generally to associating closely
related media entities using one or more inherent characteristics of
the media entities. In an illustrative implementation, the inventive
concepts described herein may comprise one or more features for use
in broadcasting or rendering of media from a network-enabled
computing device, such as a radio, or a radio broadcast rendered via
a network portal, such as a Web site. The matching process works to
power one or more features for the above-described song analysis and
matching system. Specifically, the present invention employs one or
more data sets having media content data (e.g. song data), and media
entity characteristics (e.g. factor similarity matrices,
characteristic scales, and characteristic weights) to pare down
closely related media entities from a library of disparate media
entities. One or more algorithms operate on these data sets to
create one or more subsets of media entities having close
relationships. In operation, the algorithms are applied to the data
sets to generate running scores so that media entities may be
compared with each other on a media entity by media entity basis to
quantitatively separate dissimilar media entities and associate
similarly situated media entities.

Existing media matching processes to identify closely related media
entities are very broad and have not specifically employed
characteristics of the media itself when trying to create relevant
associations, and hence cannot as effectively provide consistent and
relevant matches between similarly situated media entities.

In connection with the above-described song analysis, classification
and matching processes, the present invention provides advancements
in the area of matching closely related media entities, that when
implemented serve to power features and operations for media
distribution systems offering to participating users the ability to
retrieve one or more highly related media entity sets via only a
small amount of effort. By leveraging the song analysis and matching
techniques, users can accurately "ask" for music for which there
will be high affinity. Using pre-defined media entity
characteristics in one or more weighting algorithms and processes, a
library of media entities may be sorted into sets containing closely
related and/or similarly situated media entities.

FIGS. 4 through 5A illustrate the processing that is performed to
produce media entity sets having closely related media entities.

FIG. 4 shows the processing flow and data flow of the present
invention to pare down a library of media entities (e.g. songs) to
generate one or more sets of matched media entities using score
matching and/or aggregation matching techniques. As shown, song data
400 and similarity matrices 405 act as input for song matching
algorithm 410. Song matching algorithm 410 (as further described by
FIGS. 5 and 5A) operates on song data 400 and similarity matrices
405 to generate match scores 415. Match scores serve as a basis to
quantify similarities between songs in song data set 405. Once the
match scores are generated at 415, a check is performed at block 420
to determine if a match is to be performed using aggregation
techniques. If aggregation is not to be used, processing proceeds to
block 425 where the songs are sorted by score. The resultant set is
song-to-song match index 430. Alternatively, if at block 425, match
by aggregation is preferred, processing proceeds to block 435 where
song popularity data set 435 is summed with weighted edges data set
at summation block 440. The resultant data set is collated at block
450 to produce a resultant data set 445. Resultant data set 445
contains data elements where edges are sorted into buckets by source
aggregation. Processing then proceeds to blocks 460 and block 475.
At block 460, data set 455 is sorted so that the sinks are sorted by
weight. The resultant set from sort 460 is aggregate-to-song match
index data set 465. Concurrently, data set 455 is collated at block
475 to produce data set 480 where the edge buckets are grouped by
sinks and each bucket is grouped by source. A sort is then performed
at block 485 of data set 480 such that the buckets are sorted by
weight. A resultant data set 490 is generated from sort 480 so that
data set 490 contains an aggregate-to-aggregate match index. As a
result, performing aggregate matching yields aggregate-to-song match
index data set 465 and aggregate-to-aggregate match index data set
490.

A number of data sets are used to execute the matching algorithm of
block 410. Specifically, a song data set is employed. Prior to the
matching process, each song is analyzed, and data is created which
describes the intrinsic musical properties of the song. This data
can either be a scaled number, which implies that the numeric
designation has significant mathematical meaning, or the data can be
an attribute whose number has no inherent meaning. In the second
case, a data similarity matrix data set is required to
mathematically assign a value to the relationship between two data
values. The similarity data matrix may comprise various factors
including but not limited to similarity driven factors (e.g. rhythm
time, rhythm type, style, sub-genre, vocal voices, flavor, and mood)
and scale driven factors (emotion, density, weight, consonance,
melodic movement, tempo, and rhythm activity).

In addition, the present invention contemplates the use of factor
similarity matrices (e.g. inherent characteristic matrices).
Generally, each non-scale driven data set has associated similarity
matrices that describe the importance of that data in determining a
song's musical affinity to another. These matrices are flexible.
Each matrix driven factor has a base matrix used for most cases, but
there are also specialized matrices for some factors based upon the
style of the song being analyzed, or, alternatively, the historical
preferences of the individual user.

Further, factor scales are employed. For data not associated via the
data similarity matrices, customized scales for each different
factor are employed. In doing so, the difference between a score of
10 and 5 in emotion is made more or less significant than the
difference between scores of an 8 and a 3, for instance. This
architecture allows more flexibility in how to interpret the song
data.

Lastly, factor weights are utilized in the matching process. Some
musical factors are more important than others in determining
musical affinity. Therefore, a table of factor weights is used to
describe the relative importance of the individual song factors.

The matching algorithm of processing block 410 may be employed to
find the closest set of songs in terms of musical affinity to one
particular song. This algorithm is repeated for each song in order
to create a complete rollup table for an exemplary library of songs.
As such, the matching algorithm makes a massively parallel
implementation feasible, the extreme of which would be one computing
node per song.

In an illustrative implementation, the resulting values from the
exemplary matching algorithm are pre-computed
into a lookup table. In operation, the lookup table may be
employed by a media content delivery computing application
(e.g. a music content Web-site) to facilitate fast lookups
of the song meta-data computed by the matching algorithm.

The exemplary matching algorithm may be employed to
match a single song against a library of songs.

FIGS. 5 and 5A describe the processing performed by the
exemplary algorithm of processing block 410 when performing
the single song match. As shown in FIG. 5, processing begins
at block 500 where the search space against which a given
song (e.g. a reference song) is matched is pared according
to the sub-genre of the reference song. A sub-genre
comprises one of the similarity matrix driven factors and/or
scale driven factors described above. The sub-genre of the
song being matched is cross-referenced against the sub-genre
similarity matrix to create a numerical score. For the
sub-genres that do not have a corresponding entry in the
similarity matrix, all songs in those sub-genres are
eliminated from the search space at block 505. The songs
that remain are placed in buckets according to sub-genre at
block 510. A comparison of the scores is then performed at
block 515 to determine the sub-genre having the lowest
cross-referenced score to the reference song. The songs from
this sub-genre are selected at block 520 to prune down the
search space and to seed the matching process for greater
efficiency.

Once the search space has been pruned, processing proceeds
to block 530 of FIG. 5A, where the song is now compared
against every other song in the remaining alive set of songs
(i.e. songs from the initial library of songs that survive
processing steps 500-520). A song match score is kept for
the comparison between the two songs at block 535. This
comparison involves looping through each of the song factors
listed above. For each factor, either the similarity value
is looked up in the similarity matrix, or a value is
computed using the factor scales (the default scale is
linear difference). In the special case of song tempo, the
relative difference between the two songs is measured, as
opposed to using a simple scale. The resulting value in all
three cases is then squared, and then multiplied by the
factor weight at block 540. The following exemplary
equations represent the processing performed to realize
scorekeeping for song A (e.g. the song being matched
against) and song B (e.g. a song in the search space).

Factor ∈ Matrix_driven:

$$\mathrm{score} = \mathrm{score} + \mathrm{factorweight}[\mathrm{factor}] \cdot \bigl(\mathrm{similarity}[\mathrm{factor}](\mathrm{factor}_A, \mathrm{factor}_B)\bigr)^2$$

Factor ∈ Scale_driven:

$$\mathrm{score} = \mathrm{score} + \mathrm{factorweight}[\mathrm{factor}] \cdot \bigl(\mathrm{scale}[\mathrm{factor}](\mathrm{factor}_A) - \mathrm{scale}[\mathrm{factor}](\mathrm{factor}_B)\bigr)^2$$

Factor = tempo:

$$\mathrm{score} = \mathrm{score} + \mathrm{factorweight}[\mathrm{factor}] \cdot \left(\frac{\mathrm{factor}_A - \mathrm{factor}_B}{\mathrm{factor}_A} \times 100\right)^2$$

As the score for the comparison between songs A and B is
incremented, the score is compared against the best N scores
seen so far at block 545. If the score goes above that
threshold at any point, the comparison is terminated and
songs A and B are marked as "non-similar" at block 555. If
songs A and B are similar enough (i.e. they have a low
enough score) then song B is placed in a list, along with
its score, with the other (N-1) closest songs in terms of
musical affinity at block 550. When a good match is found,
the match that was previously marked as the Nth best match
is removed from this good match queue and marked as
"non-similar". A check is performed at block 560 to
determine if there are more songs to compare the reference
song against. If there are no more songs for comparison,
processing terminates at block 464. However, if the
alternative proves true, processing reverts to block 530 and
proceeds therefrom. The result of this processing is that
only the closest N songs are stored, and many songs may be
thrown out after comparing only the two or three most
significant attributes, instead of cycling through them all.
To efficiently implement the algorithm, the successfully
matched (song, score) pairs may be stored in a priority
queue (e.g. a specialized binary tree such as a heap).

Once all of the songs in the search space have been
compared, the remaining N best matches to that song are
placed in a rollup table, ordered by song affinity (score)
for efficient indexing from an exemplary Web application.

As described in FIG. 4, the present invention contemplates
an alternative to score based matching, namely, aggregate
matching. Relying on the score matches, the present
invention contemplates a matching process to aggregate the
results into more than just a media entity by media entity
matching (e.g. song by song matching), but also artist to
artist matching, or other aggregate forms of media entity
matching. For example, the following matches are
contemplated: media entity to media entity (e.g. song to
songs), author to authors, artist to artists, album to
albums, user-selected media entities to media entities (e.g.
songs to songs), artist to media entity (e.g. songs), album
to media entities (e.g. songs), album to artist, and artist
to album.

Aggregate matching may employ a node counting algorithm to
perform desired matches for a given library of media
entities. In an illustrative implementation, aggregate
matching may be realized by generating a graph, having an
edge connecting each media entity (e.g. song) that has a
pre-defined range match to another media entity (e.g. top 50
match to another song). Each edge has a direction, going
from the source (the media entity being matched against) to
the sink (the second media entity, which "matches" the
media entity). The set of media entities for a particular
aggregation is examined after this. This set of media
entities may be generated by processing steps 400-415 of
FIG. 4. To achieve aggregation, the processed media entity
data set should contain all the media entity to media entity
data for all the media entities in the aggregation being
queried. When processed, each edge is given a weight that is
relative to its ranking, or other factor, such as the
source's general popularity. The popularity factor may be
used to ensure that end-users observe aggregate matching
based upon familiar songs from an artist or album.
Comparatively, if popularity is not employed, obscure
selections may be used, resulting in a less than personal
experience for the end-user.

In the exemplary algorithm, the weights of the edges are
summed and categorized according to their sinks so that
each media entity which acts as a sink gets at least one
bucket. As a result, those media entities that match several
of the aggregated songs will have multiple "edges" in their
bucket. After summation of the edge weights, media entities
with multiple edges will have much higher weights than
others, and therefore "match" that aggregate of media
entities better than others. This weighting is used to
create the individual media entity affinities to the
aggregation. The aggregation may be, for example, an artist
or an album against which the reference media entity is
being matched. The result of the aggregation algorithm
provides one or more media entities that have a close
relationship to a particular classifier. For example, if an
artist is provided as the classifier, the aggregation
algorithm would provide a resultant data set having media
entities that would have characteristics most similar to the
provided artist (e.g. if the media entity is a song, the
aggregation algorithm would provide a resultant data set
having songs that would sound most similar to the provided
artist).
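
The node counting aggregation described above can be
sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the
implementation of the invention: the function name
`aggregate_matches`, the linear rank-to-weight mapping, the
form of the optional `popularity` multipliers, and the
choice to skip sinks that already belong to the aggregation
are all hypothetical.

```python
from collections import defaultdict


def aggregate_matches(top_matches, aggregate, popularity=None, limit=5):
    """Rank sink songs by summed edge weight for a set of source songs.

    top_matches: dict mapping each source song to its ranked list of
        matched songs, best match first (e.g. one rollup table row).
    aggregate: the songs that make up the aggregation (e.g. one artist).
    popularity: optional dict of per-source multipliers (assumed form).
    limit: how many of the best sinks to return.
    """
    members = set(aggregate)
    buckets = defaultdict(float)  # one bucket of summed weight per sink
    for source in members:
        ranked = top_matches.get(source, [])
        for rank, sink in enumerate(ranked):
            if sink in members:
                continue  # assumption: ignore songs already in the aggregate
            # Assumption: weight falls off linearly with the match ranking.
            weight = len(ranked) - rank
            if popularity is not None:
                weight *= popularity.get(source, 1.0)
            buckets[sink] += weight  # each edge is categorized by its sink
    # Highest summed weight first; ties are broken by song identifier.
    return sorted(buckets.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Hypothetical top-3 matches for three songs by one artist.
top_matches = {
    "a1": ["x", "y", "z"],
    "a2": ["y", "x", "w"],
    "a3": ["y", "z", "a1"],
}
result = aggregate_matches(top_matches, ["a1", "a2", "a3"])
```

In this toy data set, song "y" is a sink for all three
source songs, so its bucket accumulates the largest summed
weight and it ranks first for the artist as a whole.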

In addition, aggregations of media entities on either side
of the match process may be performed by the present
invention. For example, an aggregation may be performed to
match one artist to other artists: the data first generated
by the aggregation algorithm is aggregated once more to
obtain similarly situated artists (e.g. similarly sounding
artists).
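
Returning to the score based single song match of FIGS. 5
and 5A (blocks 530-560), the scorekeeping loop with early
termination and a priority queue of the N best matches might
be sketched as follows. This is only an illustrative sketch:
the names `song_score` and `best_matches`, the dictionary
layout of songs, weights, similarity matrices and scales,
the ordering of factors by descending weight, and the use of
song A's tempo as the denominator of the relative tempo
difference are assumptions made for the example.

```python
import heapq


def song_score(a, b, weights, similarity, scales, cutoff=float("inf")):
    """Weighted squared-difference score of songs a and b (lower is closer).

    Returns None as soon as the running score exceeds cutoff, so that a
    dissimilar song is abandoned after a few significant factors.
    """
    score = 0.0
    # Assumption: visit the most heavily weighted factors first.
    for factor, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        if factor == "tempo":
            # Relative difference in percent (denominator is an assumption).
            value = (a[factor] - b[factor]) / a[factor] * 100.0
        elif factor in similarity:
            # Matrix driven factor: look the pair up in the matrix.
            value = similarity[factor][(a[factor], b[factor])]
        else:
            # Scale driven factor: the default scale is linear difference.
            scale = scales.get(factor, lambda x: x)
            value = scale(a[factor]) - scale(b[factor])
        score += weight * value ** 2
        if score > cutoff:
            return None  # "non-similar": stop comparing early
    return score


def best_matches(reference, library, n, weights, similarity, scales):
    """Return the n closest (score, song_id) pairs, closest first."""
    heap = []  # negated scores, so heap[0] is the worst of the best n
    for index, (song_id, song) in enumerate(library.items()):
        cutoff = -heap[0][0] if len(heap) == n else float("inf")
        score = song_score(reference, song, weights, similarity, scales, cutoff)
        if score is None:
            continue
        entry = (-score, index, song_id)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            heapq.heapreplace(heap, entry)  # drop the previous Nth best
    return sorted((-neg, song_id) for neg, _, song_id in heap)


# Hypothetical factors and songs, for illustration only.
weights = {"mood": 2.0, "tempo": 1.0, "density": 1.0}
similarity = {"mood": {("happy", "happy"): 0.0,
                       ("happy", "calm"): 1.0,
                       ("happy", "sad"): 3.0}}
reference = {"mood": "happy", "tempo": 100.0, "density": 3}
library = {
    "s1": {"mood": "happy", "tempo": 100.0, "density": 3},
    "s2": {"mood": "calm", "tempo": 110.0, "density": 4},
    "s3": {"mood": "sad", "tempo": 100.0, "density": 3},
    "s4": {"mood": "happy", "tempo": 102.0, "density": 5},
}
closest = best_matches(reference, library, 2, weights, similarity, {})
```

Keeping the worst of the current N best scores at the top of
the heap makes it cheap both to read the cutoff used for
early termination and to evict the previous Nth best match
when a closer song is found.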

As mentioned above, the media contemplated by the 
present invention in all of its various embodiments is not 
limited to music or songs, but rather the invention applies to 
any media to which a classification technique may be
applied that merges perceptual (human) analysis with acous- 
tic (DSP) analysis for increased accuracy in classification 
and matching. 

The various techniques described herein may be implemented
with hardware or software or, where appropriate, with a
combination of both. Thus, the methods and apparatus of the
present invention, or certain aspects or portions thereof,
may take the form of program code (i.e., instructions)
embodied in tangible media, such as floppy diskettes,
CD-ROMs, hard drives, or any other machine-readable storage
medium, wherein, when the program code is loaded into and
executed by a machine, such as a computer, the machine
becomes an apparatus for practicing the invention. In the
case of program code execution on programmable computers,
the computer will generally include a processor, a storage
medium readable by the processor (including volatile and
non-volatile memory and/or storage elements), at least one
input device, and at least one output device. One or more
programs are preferably implemented in a high level
procedural or object oriented programming language to
communicate with a computer system. However, the program(s)
can be implemented in assembly or machine language, if
desired. In any case, the language may be a compiled or
interpreted language, and combined with hardware
implementations.

The methods and apparatus of the present invention may also
be embodied in the form of program code that is transmitted
over some transmission medium, such as over electrical
wiring or cabling, through fiber optics, or via any other
form of transmission, wherein, when the program code is
received and loaded into and executed by a machine, such as
an EPROM, a gate array, a programmable logic device (PLD), a
client computer, a video recorder or the like, the machine
becomes an apparatus for practicing the invention. When
implemented on a general-purpose processor, the program code
combines with the processor to provide a unique apparatus
that operates to perform the indexing functionality of the
present invention. For example, the storage techniques used
in connection with the present invention may invariably be a
combination of hardware and software.

While the present invention has been described in
connection with the preferred embodiments of the various
figures, it is to be understood that other similar
embodiments may be used or modifications and additions may
be made to the described embodiment for performing the same
function of the present invention without deviating
therefrom. For example, while exemplary embodiments of the
invention are described in the context of music data, one
skilled in the art will recognize that the present invention
is not limited to music, and that the methods of tailoring
media to a user, as described in the present application,
may apply to any computing device or environment, such as a
gaming console, handheld computer, portable computer, etc.,
whether wired or wireless, and may be applied to any number
of such computing devices connected via a communications
network, and interacting across the network. Furthermore, it
should be emphasized that a variety of computer platforms,
including handheld device operating systems and other
application specific operating systems, are contemplated,
especially as the number of wireless networked devices
continues to proliferate. Therefore, the present invention
should not be limited to any single embodiment, but rather
construed in breadth and scope in accordance with the
appended claims.
What is claimed is: 

1. A method for selecting media entities that closely relate 
to a reference media entity from a data set having at least one 
media entity comprising the steps of: 

producing media entities match scores for said at least one 
media entity of said data set by comparing media entity 
affinity data of reference song with media affinity data 
of said at least one media entity of said data set, 
wherein said media affinity data comprises at least one 
factor from a similarity matrix; 

storing media entity match scores for media entities 
having scores nearest the score to the reference media 
entity for selecting the closest set of media entities to 
said reference media entity; and 

generating a resultant data set of media entities using said 
produced and stored match scores, wherein said result- 
ant data set containing closely related media entities to 
said reference media entity. 

2. The method as recited in claim 1, wherein the produc- 
ing step further comprising the step of pruning down said 
data set having at least one media entity using sub-genre 
factors to eliminate dissimilar media entities from being 
processed in said selection method. 

3. The method as recited in claim 2, further comprising 
the steps of determining at least one sub-genre for said 
reference media entity and comparing said sub-genre with 
said sub-genre factors of said at least one media of said data 
set, said sub-genre factors comprising at least one of rhythm 
time, rhythm type, style, vocal voices, flavor, mood, 
emotion, density, weight, consonance, melodic movement, 
tempo, or rhythm activity. 

4. The method as recited in claim 3, further comprising 
the step of applying at least one scoring-based algorithm on
said data set having said at least one media entity to produce 
said matching scores, said algorithm employing said sub- 
genre factors to produce said matching scores. 

5. The method as recited in claim 1, wherein said pro- 
ducing step further comprises the step of performing a 
look-up on at least one similarity matrix to obtain said 
matching score. 

6. The method as recited in claim 1, further comprising 
the step of associating N number of media entities having the 
closest relationship based on said matching scores with said 
reference media entity. 

7. The method as recited in claim 1, further comprising 
the step of producing a roll-up table, said roll-up table 
having at least one look-up index to facilitate searching of 
similarly matched songs. 

8. The method as recited in claim 1, further comprising 
the step of performing aggregate matching to produce a data 
set of matched media entities matched at least one combi- 
nation of various media entity characteristics, said charac- 
teristics comprising at least one of media entity author, 
media entity title, or user-defined search parameters. 
9. The method as recited in claim 8, wherein aggregate
matching further comprises the steps of: 

creating a graph with an edge connecting each media 
entity having a pre-determined range of match to 
another media entity, wherein each edge has a direction
going from a source (the media entity being matched 
against) to the sink (the second media entity, which 
matches the first media entity); 

assigning a weight to each edge relative to its ranking 
within said range;

summing the edge weights; and 

categorizing said summed edge weights for said media 
entities according to their sinks. 

10. A computer readable medium bearing computer
executable instructions for carrying out the method of claim 1.

11. A modulated data signal carrying computer executable 
instructions for carrying out the method of claim 1. 

12. A computing device comprising means for carrying
out each of the steps of the method of claim 1. 

13. A system for matching media entities to provide 
associations between closely related media entities compris- 
ing: 

a scoring system that accepts as input data sets
representative of media entity data and similarity matrices data,
wherein said scoring system applies at least one 
weighting algorithm on said media entity data and said 
similarity matrices data to calculate matched scores, 
said algorithm employing inherent media element
characteristics when calculate said matched scores; and

a data store cooperating with said scoring system to store 
said matched scores for media entities having scores 
nearest the score to a selected reference media entity. 

14. The system as recited in claim 13, further comprising
an aggregation system, said aggregation system employing
at least one node counting algorithm to assign weights to
media entities, wherein said media entities are sorted
according to said assigned weights to create associations
between relevant and closely related media entities.

15. The system as recited in claim 14, wherein said at least 
one node counting algorithm creates a graph with an edge 
connecting each media entity having a pre-determined range 
of match to another media entity, wherein each edge has a 
direction going from a source (the media entity being
matched against) to the sink (the second media entity, which 
matches the first media entity), assigning a weight to each 
edge relative to its ranking within said range, summing the 
edge weights, and categorizing said summed edge weights 
for said media entities according to their sinks. 

16. The system as recited in claim 13, wherein said media 
element characteristics comprise at least one of rhythm time, 
rhythm type, style, vocal voices, flavor, mood, emotion, 
density, weight, consonance, melodic movement, tempo, or 
rhythm activity. 

17. The system as recited in claim 13, wherein said data 
store stores said media elements in a roll-up table having at 
least one index that is accessible by participating users. 

18. The system as recited in claim 13, wherein said 
scoring system compares a reference media entity to a 
library of media entities, said reference entity being com- 
pared to said library of media entities using media entity 
characteristics to calculate said matched scores, said scoring 
system associating media entities as closely related if said 
calculated matched score is below a pre-defined threshold 
score. 

19. The system recited in claim 13, further comprising a 
communications network cooperating with said scoring sys- 
tem and said data store to communicate data representative 
of matched media entities and media entities to participating 
users, said communications network comprising any of a 
fixed-wire public network, a wireless public network, a fixed 
wire private network, and a wireless private network. 

20. A method to enhance the experience of participating 
users of a media entity distribution system comprising the 
steps of: 

providing a library of media entities; 

applying media entity matching processes on said library 
of media elements to create associations between the 
media entities of said library using inherent character- 
istics of said media elements, wherein said media entity 
matching processes employ at least one of a
score-based matching or node counting matching;

storing for distribution said associations in a roll-up table 
having one or more indices, said roll-up table provided 
to said participating users. 

***** 